A DNA CONSTRUCT ENCODING THE YAP3 SIGNAL PEPTIDE
FIELD OF INVENTION 

The present invention relates to a DNA construct comprising the
YAP3 signal peptide for secretion of a heterologous
polypeptide, a yeast cell containing the DNA construct and a
method of producing heterologous polypeptides in yeast from the
DNA construct.

BACKGROUND OF THE INVENTION 

Yeast organisms produce a number of proteins which are
synthesized intracellularly, but which have a function outside
the cell. Such extracellular proteins are referred to as
secreted proteins. These secreted proteins are expressed
initially inside the cell in a precursor or a pre-protein form
containing a presequence ensuring effective direction of the
expressed product across the membrane of the endoplasmic
reticulum (ER). The presequence, normally named a signal
peptide, is cleaved off from the rest of the protein during
translocation. Once it has entered the secretory pathway, the
protein is transported to the Golgi apparatus. From the Golgi
the protein can follow different routes that lead to
compartments such as the cell vacuole or the cell membrane, or
it can be routed out of the cell to be secreted to the external
medium (Pfeffer, S.R. and Rothman, J.E. Ann. Rev. Biochem.
(1987), 829-852).

Several approaches have been suggested for the expression and
secretion in yeast of proteins heterologous to yeast. European
published patent application No. 88 632 describes a process by
which proteins heterologous to yeast are expressed, processed
and secreted by transforming a yeast organism with an
expression vehicle harbouring DNA encoding the desired protein
and a signal peptide, preparing a culture of the transformed
organism, growing the culture and recovering the protein from
the culture medium. The signal peptide may be the signal
peptide of the desired protein itself, a heterologous signal
peptide or a hybrid of native and heterologous signal peptide.

A problem encountered with the use of signal peptides
heterologous to yeast might be that the heterologous signal
peptide does not ensure efficient translocation and/or cleavage
after the signal peptide.

The S. cerevisiae MFα1 (α-factor) is synthesized as a prepro
form of 165 amino acids comprising a signal or prepeptide of 19
amino acids followed by a "leader" or propeptide of 64 amino
acids, encompassing three N-linked glycosylation sites followed
by (LysArg(Asp/Glu, Ala)2-3 α-factor)4 (Kurjan, J. and Herskowitz,
I. Cell 30 (1982), 933-943). The signal-leader part of the
preproMFα1 has been widely employed to obtain synthesis and
secretion of heterologous proteins in S. cerevisiae.

Use of signal/leader peptides homologous to yeast is known from 
i.a. US patent specification No. 4,546,082, European published 
patent applications Nos. 116 201, 123 294, 123 544, 163 529, 
and 123 289 and DK patent application No. 3614/83. 

In EP 123 289 utilization of the S. cerevisiae a-factor
precursor is described whereas WO 84/01153 indicates utilization
of the Saccharomyces cerevisiae invertase signal peptide and DK
3614/83 utilization of the Saccharomyces cerevisiae PHO5 signal
peptide for secretion of foreign proteins.

US patent specification No. 4,546,082, EP 116 201, 123 294, 123
544, and 163 529 describe processes by which the α-factor
signal-leader from Saccharomyces cerevisiae (MFα1 or MFα2) is
utilized in the secretion process of expressed heterologous
proteins in yeast. By fusing a DNA sequence encoding the
S. cerevisiae MFα1 signal/leader sequence at the 5' end of the
gene for the desired protein, secretion and processing of the
desired protein was demonstrated.

A number of secreted proteins are routed so as to be exposed to
a proteolytic processing system which can cleave the peptide
bond at the carboxy end of two consecutive basic amino acids.
This enzymatic activity is in S. cerevisiae encoded by the KEX2
gene (Julius, D.A. et al., Cell 37 (1984b), 1075). Processing
of the product by the KEX2 gene product is needed for the
secretion of active S. cerevisiae mating factor α (MFα or
α-factor) but is not involved in the secretion of active S.
cerevisiae mating factor a.

The use of the mouse salivary amylase signal peptide (or a
mutant thereof) to provide secretion of heterologous proteins
expressed in yeast has been described in WO 89/02463 and WO
90/10075. It is the object of the present invention to provide
a more efficient expression and/or secretion in yeast of
heterologous proteins.

SUMMARY OF THE INVENTION 

It has surprisingly been found that the signal peptide of the
yeast aspartic protease 3 is capable of providing improved
secretion of proteins expressed in yeast compared to the mouse
salivary amylase signal peptide.

Accordingly, the present invention relates to a DNA construct
comprising the following sequence

5'-P-SP-(LP)n-PS-HP-3'

wherein

P is a promoter sequence,

SP is a DNA sequence encoding the yeast aspartic protease 3
(YAP3) signal peptide,

LP is a DNA sequence encoding a leader peptide,

n is 0 or 1,

PS is a DNA sequence encoding a peptide defining a yeast
processing site, and

HP is a DNA sequence encoding a polypeptide which is
heterologous to a selected host organism.
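
The element order defined above can be pictured as a small data
structure. The following Python sketch is illustrative only: the
class name Construct, its field names, and the example leader,
processing-site and product sequences are assumptions made for
this example and are not taken from the specification. An empty
leader models n = 0, and the check on element lengths merely
confirms that the translated elements (SP, LP, PS, HP) stay in one
reading frame.

```python
from dataclasses import dataclass

# YAP3 signal peptide coding sequence as written in SEQ ID No:1 (21 codons).
YAP3_SP = (
    "ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
    "TCT CAG GTC CTT GGC"
).replace(" ", "")


@dataclass
class Construct:
    """Hypothetical model of the layout 5'-P-SP-(LP)n-PS-HP-3'."""
    promoter: str   # P: kept as a label only, it is not translated
    sp: str         # SP: signal peptide coding sequence
    ps: str         # PS: processing site coding sequence
    hp: str         # HP: heterologous polypeptide coding sequence
    lp: str = ""    # LP: leader coding sequence; empty means n = 0

    @property
    def n(self) -> int:
        return 1 if self.lp else 0

    def layout(self) -> str:
        names = ["P", "SP"] + (["LP"] if self.lp else []) + ["PS", "HP"]
        return "5'-" + "-".join(names) + "-3'"

    def coding_region(self) -> str:
        """Concatenate SP-(LP)n-PS-HP, refusing elements that shift the frame."""
        parts = [self.sp, self.lp, self.ps, self.hp]
        for part in parts:
            if len(part) % 3 != 0:
                raise ValueError("element length is not a multiple of three")
        return "".join(parts)


example = Construct(
    promoter="TPI",
    sp=YAP3_SP,
    lp="CAACCAATAGAC",   # assumed 4-codon leader fragment, for illustration
    ps="AAAAGA",         # Lys-Arg
    hp="GGTTCTTAA",      # assumed toy product: Gly-Ser-stop
)
print(example.layout())
print(len(example.coding_region()))
```

Keeping the promoter as a plain label reflects that only the
SP-(LP)n-PS-HP portion is translated; a real design tool would
also carry the terminator and vector backbone.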

The term "signal peptide" is understood to mean a presequence
which is predominantly hydrophobic in nature and present as an
N-terminal sequence of the precursor form of an extracellular
protein expressed in yeast. The function of the signal peptide
is to allow the heterologous protein to be secreted to enter
the endoplasmic reticulum. The signal peptide is cleaved off in
the course of this process. The YAP3 signal sequence has been
reported previously, fused to its native gene (cf. M.
Egel-Mitani et al., Yeast 6, 1990, pp. 127-137). A DNA construct
wherein the YAP3 signal sequence is fused to a DNA sequence
encoding a heterologous polypeptide is believed to be novel.
The YAP3 signal peptide has not previously been reported to
provide efficient secretion of heterologous polypeptides in
yeast.

In the present context, the expression "leader peptide" is
understood to indicate a peptide whose function is to allow the
heterologous polypeptide to be directed from the endoplasmic
reticulum to the Golgi apparatus and further to a secretory
vesicle for secretion into the medium (i.e. export of the
expressed polypeptide across the cell wall or at least through
the cellular membrane into the periplasmic space of the cell).

The expression "heterologous polypeptide" is intended to 
indicate a polypeptide which is not produced by the host yeast 
organism in nature. 

In another aspect, the present invention relates to a
recombinant expression vector comprising the DNA construct of
the invention.



WO 95/02059 



PCT/DK94/00281 



In a further aspect, the present invention relates to a cell
transformed with the recombinant expression vector of the
invention.

In a still further aspect, the present invention relates to a
method of producing a heterologous polypeptide, the method
comprising culturing a cell which is capable of expressing a
heterologous polypeptide and which is transformed with a DNA
construct of the invention in a suitable medium to obtain
expression and secretion of the heterologous polypeptide, after
which the heterologous polypeptide is recovered from the
medium.



DETAILED DESCRIPTION OF THE INVENTION 

In a specific embodiment, the YAP3 signal peptide is encoded by
the following DNA sequence

ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA
TCT CAG GTC CTT GGC (SEQ ID No:1)

or a suitable modification thereof encoding a peptide with a
high degree of homology (at least 60%, more preferably at least
70%, sequence identity) to the YAP3 signal peptide. Examples of
"suitable modifications" are nucleotide substitutions which do
not give rise to another amino acid sequence of the peptide,
but which may correspond to the codon usage of the yeast
organism into which the DNA sequence is introduced, or
nucleotide substitutions which do give rise to a different
amino acid sequence of the peptide (although the amino acid
sequence should not be modified to the extent that it is no
longer able to function as a signal peptide). Other examples of
possible modifications are insertion of three or multiples of
three nucleotides at either end of or within the sequence, or
deletion of three or multiples of three nucleotides at either
end of or within the sequence.
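
The distinction drawn above between silent substitutions and
substitutions that change the peptide can be checked mechanically.
The Python sketch below is an illustration under stated
assumptions, not part of the disclosed method: it translates SEQ
ID No:1 with the standard genetic code and compares it with the
signal-peptide coding portion of oligonucleotide MHJ 1131 from
Example 3 (the 21 codons starting at its ATG). The function names
translate and percent_identity are invented for this example, and
the identity measure is a simple position-by-position comparison
of equal-length peptides.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) codon by codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(a: str, b: str) -> float:
    """Position-wise identity of two equal-length peptides, in percent."""
    if len(a) != len(b):
        raise ValueError("this simple measure needs equal lengths")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


SEQ_ID_1 = ("ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
            "TCT CAG GTC CTT GGC")
# Signal-peptide coding part of oligonucleotide MHJ 1131 (Example 3).
MHJ1131_SP = ("ATG AAG CTT AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
              "TCG CAG GTC CTA GGT")

peptide = translate(SEQ_ID_1)
variant = translate(MHJ1131_SP)
changed_codons = [
    i + 1
    for i, (c1, c2) in enumerate(zip(SEQ_ID_1.split(), MHJ1131_SP.split()))
    if c1 != c2
]

print(peptide)                             # MKLKTVRSAVLSSLFASQVLG
print(changed_codons)                      # [2, 3, 17, 20, 21]
print(percent_identity(peptide, variant))  # 100.0
```

Five codons differ between the two DNA sequences, yet both encode
the same 21-residue peptide, which is the first kind of
modification described above. A substitution that does change a
residue would lower the identity figure and could be compared
against the 60% or 70% thresholds in the same way.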

In the sequence 5'-P-SP-(LP)n-PS-HP-3', n is preferably 1. In
other words, although the YAP3 signal peptide may, in some
instances, in itself provide secretion and/or processing of the
heterologous polypeptide, a leader or pro-peptide sequence is
preferably present. The leader may be a yeast MFα1 leader
peptide or a synthetic leader peptide, e.g. one of the leader
peptides disclosed in WO 89/02463 or WO 92/11378 or a
derivative thereof capable of effecting secretion of a
heterologous polypeptide in yeast. The term "synthetic" is
intended to indicate that the leader peptides in question are
not found in nature. Synthetic yeast leader peptides may, for
instance, be constructed according to the procedures described
in WO 89/02463 or WO 92/11378.

The yeast processing site encoded by the DNA sequence PS may
suitably be any paired combination of Lys and Arg, such as
Lys-Arg, Arg-Lys, Lys-Lys or Arg-Arg, which permits processing of
the heterologous polypeptide by the KEX2 protease of
Saccharomyces cerevisiae or the equivalent protease in other
yeast species (D.A. Julius et al., Cell 37, 1984, 1075 ff.). If
KEX2 processing is not convenient, e.g. if it would lead to
cleavage of the polypeptide product, a processing site for
another protease may be selected instead comprising an amino
acid combination which is not found in the polypeptide product,
e.g. the processing site for FXa, Ile-Glu-Gly-Arg (cf. Sambrook,
Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor, New York, 1989).
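
The selection rule described above (use a dibasic site unless the
product itself contains one, otherwise fall back to a site absent
from the product) can be sketched as a simple sequence scan. This
is a minimal illustration: the function names and the toy peptide
strings are hypothetical, and a real decision would also depend on
accessibility of the site in the folded protein, which a string
search cannot assess.

```python
# Paired basic residues named in the text: Lys-Arg, Arg-Lys, Lys-Lys, Arg-Arg.
DIBASIC = {"KR", "RK", "KK", "RR"}
FXA_SITE = "IEGR"  # Ile-Glu-Gly-Arg


def dibasic_sites(protein: str) -> list:
    """Return (1-based position, pair) for every dibasic pair in the product."""
    protein = protein.upper()
    return [
        (i + 1, protein[i:i + 2])
        for i in range(len(protein) - 1)
        if protein[i:i + 2] in DIBASIC
    ]


def choose_processing_site(product: str):
    """Suggest 'KR' if the product has no dibasic pair, else the FXa site
    if that motif is absent from the product, else None."""
    if not dibasic_sites(product):
        return "KR"
    if FXA_SITE not in product.upper():
        return FXA_SITE
    return None


# Toy sequences, invented purely to exercise the functions.
print(dibasic_sites("AKRGSLKKTE"))           # [(2, 'KR'), (7, 'KK')]
print(choose_processing_site("GSTAEL"))      # KR
print(choose_processing_site("AKRGSLKKTE"))  # IEGR
```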

The heterologous protein produced by the method of the
invention may be any protein which may advantageously be
produced in yeast. Examples of such proteins are aprotinin,
tissue factor pathway inhibitor or other protease inhibitors,
insulin or insulin precursors, human or bovine growth hormone,
interleukin, glucagon, tissue plasminogen activator,
transforming growth factor α or β, platelet-derived growth
factor, enzymes, or a functional analogue thereof. In the
present context, the term "functional analogue" is meant to
indicate a polypeptide with a similar function as the native
protein (this is intended to be understood as relating to the
nature rather than the level of biological activity of the
native protein). The polypeptide may be structurally similar to
the native protein and may be derived from the native protein
by addition of one or more amino acids to either or both the C-
and N-terminal end of the native protein, substitution of one
or more amino acids at one or a number of different sites in
the native amino acid sequence, deletion of one or more amino
acids at either or both ends of the native protein or at one or
several sites in the amino acid sequence, or insertion of one
or more amino acids at one or more sites in the native amino
acid sequence. Such modifications are well known for several of
the proteins mentioned above.

The DNA construct of the invention may be prepared
synthetically by established standard methods, e.g. the
phosphoramidite method described by S.L. Beaucage and M.H.
Caruthers, Tetrahedron Letters 22, 1981, pp. 1859-1869, or the
method described by Matthes et al., EMBO Journal 3, 1984, pp.
801-805. According to the phosphoramidite method,
oligonucleotides are synthesized, e.g. in an automatic DNA
synthesizer, purified, annealed, ligated and cloned into the
yeast expression vector. It should be noted that the sequence
5'-P-SP-(LP)n-PS-HP-3' need not be prepared in a single
operation, but may be assembled from two or more
oligonucleotides prepared synthetically in this fashion.

One or more parts of the DNA sequence 5'-P-SP-(LP)n-PS-HP-3' may
also be of genomic or cDNA origin, for instance obtained by
preparing a genomic or cDNA library and screening for DNA
sequences coding for said parts (typically HP) by hybridization
using synthetic oligonucleotide probes in accordance with
standard techniques (cf. Sambrook, Fritsch and Maniatis,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New
York, 1989). In this case, a genomic or cDNA sequence encoding
a signal peptide may be joined to a genomic or cDNA sequence
encoding the heterologous protein, after which the DNA sequence
may be modified by the insertion of synthetic oligonucleotides
encoding the sequence 5'-P-SP-(LP)n-PS-HP-3' in accordance with
well-known procedures.

Finally, the DNA sequence 5'-P-SP-(LP)n-PS-HP-3' may be of mixed
synthetic and genomic, mixed synthetic and cDNA or mixed
genomic and cDNA origin prepared by annealing fragments of
synthetic, genomic or cDNA origin (as appropriate), the
fragments corresponding to various parts of the entire DNA
sequence, in accordance with standard techniques. Thus, it may
be envisaged that the DNA sequence encoding the signal peptide
or the heterologous polypeptide may be of genomic or cDNA
origin, while the sequence 5'-P-SP-(LP)n-PS may be prepared
synthetically.

The recombinant expression vector carrying the sequence
5'-P-SP-(LP)n-PS-HP-3' may be any vector which is capable of
replicating in yeast organisms. In the vector, the promoter
sequence (P) may be any DNA sequence which shows
transcriptional activity in yeast and may be derived from genes
encoding proteins either homologous or heterologous to yeast.
The promoter is preferably derived from a gene encoding a
protein homologous to yeast. Examples of suitable promoters are
the Saccharomyces cerevisiae MFα1, TPI, ADH I, ADH II or PGK
promoters, or corresponding promoters from other yeast species,
e.g. Schizosaccharomyces pombe. Examples of suitable promoters
are described by, for instance, Russell and Hall, J. Biol.
Chem. 258, 1983, pp. 143-149; Russell, Nature 301, 1983, pp.
167-169; Ammerer, Meth. Enzymol. 101, 1983, pp. 192-201;
Russell et al., J. Biol. Chem. 258, 1983, pp. 2674-2682;
Hitzeman et al., J. Biol. Chem. 255, 1980, pp. 12073-12080;
Kawasaki and Fraenkel, Biochem. Biophys. Res. Comm. 108, 1982,
and T. Alber and G. Kawasaki, J. Mol. Appl. Genet. 1, 1982, pp.
419-434.

The sequences indicated above should also be operably connected
to a suitable terminator, e.g. the TPI terminator (cf. T. Alber
and G. Kawasaki, J. Mol. Appl. Genet. 1, 1982, pp. 419-434), or
the yeast CYC1 terminator.

The recombinant expression vector of the invention further
comprises a DNA sequence enabling the vector to replicate in
yeast. Examples of such sequences are the yeast plasmid 2μ
replication genes REP 1-3 and origin of replication. The vector
may also comprise a selectable marker, e.g. the
Schizosaccharomyces pombe TPI gene as described by P.R. Russell,
Gene 40, 1985, pp. 125-130, or the yeast URA3 gene.

The procedures used to insert the sequence
5'-P-SP-(LP)n-PS-HP-3' into a suitable yeast vector containing
the information necessary for yeast replication are well known
to persons skilled in the art (cf., for instance, Sambrook,
Fritsch and Maniatis, op. cit.). It will be understood that the
vector may be constructed either by first preparing a DNA
construct containing the entire sequence and subsequently
inserting this fragment into a suitable expression vector, or by
sequentially inserting DNA fragments containing genetic
information for the individual elements (such as the promoter
sequence, the signal sequence, the leader sequence, or DNA
coding for the heterologous polypeptide) followed by ligation.

The yeast organism transformed with the vector of the invention
may be any suitable yeast organism which, on cultivation,
produces large amounts of the heterologous polypeptide in
question. Examples of suitable yeast organisms may be strains
of Saccharomyces, such as Saccharomyces cerevisiae,
Saccharomyces kluyveri, or Saccharomyces uvarum,
Schizosaccharomyces, such as Schizosaccharomyces pombe,
Kluyveromyces, such as Kluyveromyces lactis, Yarrowia, such as
Yarrowia lipolytica, or Hansenula, such as Hansenula
polymorpha. The transformation of the yeast cells may for
instance be effected by protoplast formation followed by
transformation in a manner known per se.

The medium used to cultivate the cells may be any conventional
medium suitable for growing yeast organisms. The secreted
heterologous protein, a significant proportion of which will be
present in the medium in correctly processed form, may be
recovered from the medium by conventional procedures including
separating the yeast cells from the medium by centrifugation or
filtration, precipitating the proteinaceous components of the
supernatant or filtrate by means of a salt, e.g. ammonium
sulphate, followed by purification by a variety of
chromatographic procedures, e.g. ion exchange chromatography,
affinity chromatography, or the like.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is further described in the following examples
with reference to the appended drawings wherein

Fig. 1A and 1B schematically show the construction of plasmid
pLaC257;

Fig. 2 shows the DNA sequence and derived amino acid sequence
of the EcoRI-XbaI insert in pLaC257 (SEQ ID No: 2);

Fig. 3A and 3B schematically show the construction of plasmid
pLaC242Apr;

Fig. 4 shows the DNA sequence and derived amino acid sequence
of the EcoRI-XbaI fragment of pAPRSc1, wherein the protein
sequence shown in italics is derived from the random expression
cloned DNA fragment (SEQ ID No: 4);

Fig. 5 schematically shows the construction of plasmid pLaC263;

Fig. 6 shows the DNA sequence and derived amino acid sequence
of the EcoRI-XbaI fragment of pLaC263 (SEQ ID No: 6);

Fig. 7A and 7B show the DNA sequence and derived amino acid
sequence of human tissue factor pathway inhibitor (TFPI)
including its native signal peptide (SEQ ID No: 8);

Fig. 8A shows the DNA sequence and derived amino acid sequence
of the spx3 signal peptide and 212 leader peptide (shown in WO
89/02463) N-terminally fused to the TFPI sequence in plasmid
pYES-212 TFPI161-117Q (SEQ ID No: 10);

Fig. 8B shows the DNA sequence and derived amino acid sequence
of the YAP3 signal peptide and 212 leader peptide N-terminally
fused to the TFPI sequence in plasmid pYES-yk TFPI161-117Q
(SEQ ID No: 12); and

Fig. 9 shows restriction maps of plasmids pYES21,
pP-212TFPI161-117Q, pYES-212TFPI161-117Q and pYES-ykTFPI161-117Q.

The invention is further illustrated in the following examples 
which are not in any way intended to limit the scope of the 
invention as claimed. 

EXAMPLES 



Plasmids and DNA materials

All expression plasmids contain 2μ DNA sequences for
replication in yeast and use either the S. cerevisiae URA3 gene
or the Schizosaccharomyces pombe triose phosphate isomerase
gene (POT) as selectable markers in yeast. POT plasmids are
described in EP patent application No. 171 142. A plasmid
containing the POT gene is available from a deposited E. coli
strain (ATCC 39685). The POT plasmids furthermore contain the
S. cerevisiae triose phosphate isomerase promoter and
terminator (P_TPI and T_TPI). They are identical to pMT742 (M.
Egel-Mitani et al., Gene 73, 1988, pp. 113-120) (see Fig. 1)
except for the region defined by the SphI-XbaI restriction sites
encompassing the P_TPI and the coding region for
signal/leader/product. The URA3 plasmids use P_TPI and the
iso-I-cytochrome C terminator (T_CYC1).

The P_TPI has been modified with respect to the sequence found
in pMT742, only in order to facilitate construction work. An
internal SphI restriction site has been eliminated by SphI
cleavage, removal of single-stranded tails and religation.
Furthermore, DNA sequences upstream of, and without any impact
on, the promoter have been removed by Bal31 exonuclease
treatment followed by addition of an SphI restriction site
linker. This promoter construction, present on a 373 bp
SphI-EcoRI fragment, is designated P_TPIδ, and when used in
plasmids already described this promoter modification is
indicated by the addition of a δ to the plasmid name.

Finally, a number of synthetic DNA fragments have been employed,
all of which were synthesized on an automatic DNA synthesizer
(Applied Biosystems model 380A) using phosphoramidite chemistry
and commercially available reagents (S.L. Beaucage and M.H.
Caruthers (1981) Tetrahedron Letters 22, 1859-1869). The
oligonucleotides were purified by polyacrylamide gel
electrophoresis under denaturing conditions. Prior to annealing
complementary pairs of such DNA single strands, these were
kinased by T4 polynucleotide kinase and ATP.

All other methods and materials used are common state of the
art knowledge (J. Sambrook et al., Molecular Cloning, A
Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, N.Y. 1989).

Example 1

The modified mouse salivary amylase signal peptide (MSA3sp)
(described in WO 89/02463) of the expression cassette of
plasmid pLSC6315D3 (described in Example 3 of WO 92/11378)
which contains a DNA sequence coding for the insulin precursor
MI3 (B(1-29)-Ala-Ala-Lys-A(1-21)), was replaced with the YAP3
signal peptide in the following steps:

A construct for easy exchange of signal peptides was made.
Through site-directed mutagenesis an Asp718 site was introduced
just prior to the signal initiation codon in pLaC196δ (cf. WO
89/02463, fig. 5), by the double primer method applying a
mutagenic primer NOR494:

3'-ATTTGCTGCCATGGTACTTTCAGAAGG (SEQ ID No: 14)

where bold letters indicate mutations and the underlined
sequence indicates the initiation codon.

The resulting plasmid was termed pLaC196δ-Asp718 (see Fig. 1).

The nucleotide sequence of the region covering the junction
between signal peptide and leader peptide of the expression
cassette in pLSC6315D3 was modified by replacing the
ApaI-HgiAI restriction fragment with a synthetic DNA stretch,
NOR 2521/2522:

NOR2521: 5'-CAA CCA ATA GAC ACG CGT AAA GAA GGC CTA
         CAG CAT GAT TAC GAT ACA GAG ATC TTG GAG (SEQ
         ID No: 15)

NOR2522: 5'-C CAA GAT CTC TGT ATC GTA ATC ATG CTG TAG
         GCC TTC TTT ACG CGT GTC TAT TGG TTG GGC C (SEQ
         ID No: 16)

The resulting plasmid was termed pLSC6315D3R (see Fig. 1) . 

The SphI-Asp718 fragment of pLaC196δ-Asp718 was ligated with
SphI-MluI cut pLSC6315D3R plasmid and a synthetic stretch of
DNA encoding the YAP3 signal peptide:

YAP-sp1: 5'-GT ACC AAA ATA ATG AAA CTG AAA ACT GTA AGA
         TCT GCG GTC CTT TCG TCA CTC TTT GCA TCT CAG
         GTC CTT GGC CAA CCA ATA GAC A (SEQ ID No: 17)

YAP-sp2: 5'-CG CGT GTC TAT TGG TTG GCC AAG GAC CTG AGA TGC
         AAA GAG TGA CGA AAG GAC CGC AGA TCT TAC
         AGT TTT CAG TTT CTA TAT TTT G (SEQ ID No: 18)

The resulting plasmid pLaC257 essentially consists of
pLSC6315D3, in which the MSA3 signal peptide has been replaced
by the YAP3 signal peptide (see Fig. 2).

Yeast transformation: S. cerevisiae strain MT663 (E2-7B XE11-36
a/α, Δtpi/Δtpi, pep 4-3/pep 4-3) (the yeast strain MT663 was
deposited in the Deutsche Sammlung von Mikroorganismen und
Zellkulturen in connection with filing WO 92/11378 and was
given the deposit number DSM 6278) was grown on YPGaL (1% Bacto
yeast extract, 2% Bacto peptone, 2% galactose, 1% lactate) to
an O.D. at 600 nm of 0.6.



100 ml of culture was harvested by centrifugation, washed with
10 ml of water, recentrifuged and resuspended in 10 ml of a
solution containing 1.2 M sorbitol, 25 mM Na2EDTA pH = 8.0 and
6.7 mg/ml dithiothreitol. The suspension was incubated at 30 °C
for 15 minutes, centrifuged and the cells resuspended in 10 ml
of a solution containing 1.2 M sorbitol, 10 mM Na2EDTA, 0.1 M
sodium citrate, pH = 5.8, and 2 mg Novozym® 234. The suspension
was incubated at 30 °C for 30 minutes, the cells collected by
centrifugation, washed in 10 ml of 1.2 M sorbitol and 10 ml of
CAS (1.2 M sorbitol, 10 mM CaCl2, 10 mM Tris HCl (Tris =
Tris(hydroxymethyl)aminomethane) pH = 7.5) and resuspended in
2 ml of CAS. For transformation, 1 ml of CAS-suspended cells
was mixed with approx. 0.1 μg of plasmid pLaC257 and left at
room temperature for 15 minutes. 1 ml of (20% polyethylene
glycol 4000, 20 mM CaCl2, 10 mM CaCl2, 10 mM Tris HCl, pH = 7.5)
was added and the mixture left for a further 30 minutes at room
temperature. The mixture was centrifuged and the pellet
resuspended in 0.1 ml of SOS (1.2 M sorbitol, 33% v/v YPD, 6.7
mM CaCl2, 14 μg/ml leucine) and incubated at 30 °C for 2 hours.
The suspension was then centrifuged and the pellet resuspended
in 0.5 ml of 1.2 M sorbitol. Then, 6 ml of top agar (the SC
medium of Sherman et al., Methods in Yeast Genetics, Cold
Spring Harbor Laboratory (1982), containing 1.2 M sorbitol plus
2.5% agar) at 52 °C was added and the suspension poured on top of
plates containing the same agar-solidified, sorbitol-containing
medium.

Transformant colonies were picked after 3 days at 30 °C,
reisolated and used to start liquid cultures. One transformant
was selected for further characterization.

Fermentation: Yeast strain MT663 transformed with plasmid
pLaC257 was grown on YPD medium (1% yeast extract, 2% peptone
(from Difco Laboratories), and 3% glucose). A 1 liter culture
of the strain was shaken at 30 °C to an optical density at 650
nm of 24. After centrifugation the supernatant was isolated.

MT663 cells transformed with plasmid pLSC6315D3 and cultured as
described above were used for a comparison of yields of MI3
insulin precursor. Yields of MI3 were determined directly on
culture supernatants by the method of Snel, Damgaard and
Mollerup, Chromatographia 24, 1987, pp. 329-332. The results
are shown below.



plasmid                  MI3 yield

pLSC6315D3 (MSA3sp)      100%
pLaC257 (YAP3)           120%



Example 2 



Plasmid pLSC6315D3 was modified in two steps. First, the MSA3
signal peptide was replaced by the spx3 signal peptide by
exchanging the SphI-ApaI fragment with the analogous fragment
from pLaC212spx3 (cf. WO 89/02463). From the resulting plasmid
pSLC63.15spx3, a 302 bp EcoRI-DdeI fragment was isolated and
fused to the 204 bp NcoI-XbaI fragment of pKFN1003 (WO
90/10075) containing the DNA sequence encoding aprotinin via a
synthetic linker DNA, NOR2101/2100 (see Fig. 3):

NOR2101: 5'-T AAC GTC GC (SEQ ID No: 19)
NOR2100: 5'-CAT GGC GAC G (SEQ ID No: 20)
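
How such a linker bridges two restriction fragments can be
illustrated by annealing the two strands in silico. The Python
sketch below is a simplified illustration: the function names
revcomp and anneal are invented for this example, and the code
assumes that the strands pair over one contiguous core with a 5'
overhang on each strand, which is an assumption about the linker
design rather than a statement from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(top: str, bottom: str):
    """Return (top 5' overhang, paired core on the top strand,
    bottom 5' overhang) for two oligonucleotides written 5'->3'.

    Assumes a single contiguous duplex with 5' overhangs on both strands.
    """
    top = top.replace(" ", "").upper()
    bottom = bottom.replace(" ", "").upper()
    rc = revcomp(bottom)
    # Longest suffix of the top strand that matches a prefix of the
    # reverse-complemented bottom strand is taken as the paired core.
    for k in range(min(len(top), len(rc)), 0, -1):
        if top[-k:] == rc[:k]:
            return top[:len(top) - k], top[-k:], bottom[:len(bottom) - k]
    raise ValueError("strands share no complementary core")


# Linker NOR2101/NOR2100 as listed above.
top_overhang, core, bottom_overhang = anneal("T AAC GTC GC", "CAT GGC GAC G")
print(top_overhang, core, bottom_overhang)  # TAA CGTCGC CATG
```

Under these assumptions the annealed linker has a three-nucleotide
5' overhang (TAA) on one side and a four-nucleotide 5' overhang
(CATG) on the other, which is consistent with joining a DdeI end
to an NcoI end as described for the two fragments.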

The resulting plasmid, pLaC242-Apr (see Fig. 3), was cleaved
with ClaI, dephosphorylated and applied in cloning of random
5'-CG-overhang fragments of DNA isolated from S. cerevisiae
strain MT663, according to the description in WO 92/11378.
Transformation and fermentation of yeast strain MT663 was
carried out as described in Example 1.

From the resulting library, yeast transformants harbouring the
plasmid pAPR-Sc1 (prepared by the method described in WO
92/11378), containing a leader the sequence of which is given in
Fig. 4, were selected by screening. The spx3 signal peptide of
pAPR-Sc1 was replaced by the YAP3 signal peptide by fusing the
SphI-StyI fragment from pLaC257 with the 300 bp NheI-XbaI
fragment of pAPR-Sc1 via the synthetic linker DNA MH1338/1339
(see Fig. 5):

MH 1338: 5'-CTT GGC CAA CCA TCG AAA TTG AAA CCA G (SEQ ID
         No: 21)

MH 1339: 5'-CT AGC TGG TTT CAA TTT CGA TGG TTG GC (SEQ ID
         No: 22)



The resulting plasmid was termed pLaC263 (see Fig. 5). The DNA
sequence and derived amino acid sequence of the EcoRI-XbaI
fragment of pLaC263 are shown in Fig. 6.

plasmid                  aprotinin yield

pAPR-Sc1 (Spx3sp)        100%
pLaC263                  136%

Example 3 

A synthetic gene coding for human TFPI, the DNA sequence of
which was derived from the published sequence of a cDNA coding
for human tissue factor pathway inhibitor (TFPI) (Wun et al.,
J. Biol. Chem. 263 (1988) 6001-6004), was prepared by step-wise
cloning of synthetic restriction fragments into plasmid pBS(+).
The resulting gene was contained on a 928 base pair (bp) SalI
restriction fragment. The gene had 26 silent nucleotide
substitutions in degenerate codons as compared to the cDNA,
resulting in fourteen unique restriction endonuclease sites.
The DNA sequence of the 928 bp SalI fragment and the
corresponding amino acid sequence of human TFPI (pre-form) is
shown in Fig. 7 (SEQ ID No: 8).

This DNA sequence was subsequently truncated to code for a TFPI
variant composed of the first 161 amino acids. A
non-glycosylated variant, TFPI1-161-117Gln, in which the AAT
codon for Asn117 was replaced by CAA coding for Gln, was
constructed by site-directed mutagenesis in a manner known per
se using synthetic oligonucleotides. The DNA sequence encoding
TFPI1-161-117Gln was preceded by the synthetic signal-leader
sequence 212spx3 (cf. WO 89/02463), see Fig. 8A. This
construction was inserted into the plasmid pP-212TFPI161-117Q
(based on a vector of the POT-type (G. Kawasaki and L. Bell, US
patent 4,931,373), cf. Fig. 8).

A 1.1 kb Sphl-Xbal fragment containing the coding region for 
212spx3-TFPI 1 . 161 -H7Gln was isolated and cloned into the plasmid 
30 pYES21 derived from the commercially available (Stratagene) 
vector pYES2.0 (cf. Fig. 8). This plasmid contains 2m sequence 
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for replication in yeast , the yeast URA3 gene for plasmid 
selection in ura3 strains, the ^-lactamase gene for selection 
in E. coli. the ColEl origin of replication for replication in 
E. coli, the fl origin for recover of single-stranded DNA 
5 plasmid from superinfected E. coli strains, and the yeast CYC1 
transcriptional terminator. The Sphl-Xbal fragment was cloned 
into pYES 2.0 in front of the CYC1 terminator. The resulting 
plasmid pYES-212TFPI161-117Q (cf. Fig. 9) was cleaved with 
PflMI and EcoRI to remove the coding region for the mouse 
10 salivary amylase signal peptide which was replaced by a double- 
stranded synthetic oligonucleotide sequence coding for the YAP3 
signal peptide: 

MHJ 1131 5 1 AAT TCA AAC TAA AAA ATG AAG CTT AAA ACT GTA AGA 
TCT GCG GTC CTT TCG TCA CTC TTT GCA TCG CAG GTC CTA GGT CAA CCA 
15 GTC A (SEQ ID No:23) 

MHJ 1132 S'CTG GTT GAC CTA GGA CCT GCG ATG CAA AGA GTG ACG 
AAA GGA CCG CAG ATC TTA CAG TTT TAA GCT TCA TTT TTT AGT TTG 
(SEQ ID No: 24) 

resulting in plasmid pYES-ykTFPI161-117Q (cf. Fig. 8B and Fig.
9).
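
The relationship between the two oligonucleotides can be checked mechanically. The following Python sketch is an illustration only and not part of the original disclosure; the helper name `reverse_complement` is hypothetical. It tests whether MHJ 1132 is the reverse complement of an internal stretch of MHJ 1131 and reports the single-stranded ends left on the annealed duplex. That the 5' AATT end joins the EcoRI-cleaved vector end and the 3-nucleotide 3' end joins the PflMI-cleaved end is an assumption drawn from the enzymes named above.

```python
# Oligonucleotides as printed above (5' -> 3'); spaces are removed before use.
MHJ1131 = ("AAT TCA AAC TAA AAA ATG AAG CTT AAA ACT GTA AGA "
           "TCT GCG GTC CTT TCG TCA CTC TTT GCA TCG CAG GTC CTA GGT CAA CCA "
           "GTC A").replace(" ", "")
MHJ1132 = ("CTG GTT GAC CTA GGA CCT GCG ATG CAA AGA GTG ACG "
           "AAA GGA CCG CAG ATC TTA CAG TTT TAA GCT TCA TTT TTT AGT TTG"
           ).replace(" ", "")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T string."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


# The stretch of MHJ 1131 that base-pairs with MHJ 1132.
paired = reverse_complement(MHJ1132)
start = MHJ1131.find(paired)

# Whatever lies outside the paired stretch stays single-stranded.
five_prime_overhang = MHJ1131[:start]
three_prime_overhang = MHJ1131[start + len(paired):]

print(len(MHJ1131), len(MHJ1132))                  # 88 81
print(five_prime_overhang, three_prime_overhang)   # AATT TCA
```

The lengths agree with those given for SEQ ID NO: 23 and SEQ ID NO: 24 in the sequence listing.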

Plasmids pYES-212TFPI161-117Q and pYES-ykTFPI161-117Q were
transformed into the haploid yeast strain YNG318 (MATα ura3-52
leu2-Δ2 pep4-Δ1 his4-539 [cir+]). Plasmid selection was for Ura+
cells. Reisolated transformants were grown in 50 ml of
synthetic complete medium lacking uracil (SC-ura) for 3 days at
30°C. After measuring cell density (OD600), the cultures were
centrifuged and the resulting supernatants were analysed for
the level of secreted FXa/TF/FVIIa-dependent chromogenic TFPI
activity (P.M. Sandset et al., Thromb. Res. 47, 1987, pp. 389-
400). The mean activity measured for supernatants from strains
containing plasmid pYES-212TFPI161-117Q (i.e. the plasmid
containing the mouse salivary amylase signal sequence) was 0.65
U/ml·OD. The mean activity measured for supernatants from
strains containing plasmid pYES-ykTFPI161-117Q was 1.00
U/ml·OD.
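
The two mean activities can be compared directly. The short Python sketch below is illustrative arithmetic on the two figures reported above; the variable names are hypothetical and no statistical significance is implied, since the source gives only the means.

```python
# Mean secreted TFPI activities reported above, in U/ml per OD unit.
activity_212 = 0.65   # pYES-212TFPI161-117Q (mouse salivary amylase signal)
activity_yap3 = 1.00  # pYES-ykTFPI161-117Q (YAP3 signal peptide)

# Ratio of the YAP3 construct to the reference construct.
fold = activity_yap3 / activity_212
percent_increase = (fold - 1.0) * 100.0

print(f"{fold:.2f}-fold, +{percent_increase:.0f}%")  # 1.54-fold, +54%
```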



WO 95/02059 PCT/DK94/00281

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
    (A) NAME: Novo Nordisk A/S
    (B) STREET: Novo Alle
    (C) CITY: Bagsvaerd
    (E) COUNTRY: Denmark
    (F) POSTAL CODE (ZIP): 2880
    (G) TELEPHONE: +45 4444 8888
    (H) TELEFAX: +45 4449 3256

(ii) TITLE OF INVENTION: A DNA Construct Encoding the YAP3 Signal
     Peptide

(iii) NUMBER OF SEQUENCES: 24

(iv) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 63 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Saccharomyces cerevisiae

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGAAACTGA AAACTGTAAG ATCTGCGGTC CTTTCGTCAC TCTTTGCATC TCAGGTCCTT  60
GGC  63

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 476 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 81..452

(ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 81..293

(ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 294..452

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA  60

TAAACGACGG TACCAAAATA ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC  110
                      Met Lys Leu Lys Thr Val Arg Ser Ala Val
                      -71 -70                 -65

CTT TCG TCA CTC TTT GCA TCT CAG GTC CTT GGC CAA CCA ATA GAC ACG  158
Leu Ser Ser Leu Phe Ala Ser Gln Val Leu Gly Gln Pro Ile Asp Thr
    -60                 -55                 -50

CGT AAA GAA GGC CTA CAG CAT GAT TAC GAT ACA GAG ATC TTG GAG CAC  206
Arg Lys Glu Gly Leu Gln His Asp Tyr Asp Thr Glu Ile Leu Glu His
-45                 -40                 -35                 -30

ATT GGA AGC GAT GAG TTA ATT TTG AAT GAA GAG TAT GTT ATT GAA AGA  254
Ile Gly Ser Asp Glu Leu Ile Leu Asn Glu Glu Tyr Val Ile Glu Arg
                -25                 -20                 -15

ACT TTG CAA GCC ATC GAT AAC ACC ACT TTG GCT AAG AGA TTC GTT AAC  302
Thr Leu Gln Ala Ile Asp Asn Thr Thr Leu Ala Lys Arg Phe Val Asn
            -10                  -5                   1

CAA CAC TTG TGC GGT TCC CAC TTG GTT GAA GCT TTG TAC TTG GTT TGC  350
Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr Leu Val Cys
      5                  10                  15

GGT GAA AGA GGT TTC TTC TAC ACT CCT AAG GCT GCT AAG GGT ATT GTC  398
Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Ala Ala Lys Gly Ile Val
 20                  25                  30                  35

GAA CAA TGC TGT ACC TCC ATC TGC TCC TTG TAC CAA TTG GAA AAC TAC  446
Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr
                 40                  45                  50

TGC AAC TAGACGCAGC CCGCAGGCTC TAGA  476
Cys Asn



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 124 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Lys Leu Lys Thr Val Arg Ser Ala Val Leu Ser Ser Leu Phe Ala
-71 -70                 -65                 -60

Ser Gln Val Leu Gly Gln Pro Ile Asp Thr Arg Lys Glu Gly Leu Gln
-55                 -50                 -45                 -40

His Asp Tyr Asp Thr Glu Ile Leu Glu His Ile Gly Ser Asp Glu Leu
                -35                 -30                 -25

Ile Leu Asn Glu Glu Tyr Val Ile Glu Arg Thr Leu Gln Ala Ile Asp
            -20                 -15                 -10

Asn Thr Thr Leu Ala Lys Arg Phe Val Asn Gln His Leu Cys Gly Ser
         -5                   1               5

His Leu Val Glu Ala Leu Tyr Leu Val Cys Gly Glu Arg Gly Phe Phe
 10                  15                  20                  25

Tyr Thr Pro Lys Ala Ala Lys Gly Ile Val Glu Gln Cys Cys Thr Ser
                 30                  35                  40

Ile Cys Ser Leu Tyr Gln Leu Glu Asn Tyr Cys Asn
             45                  50
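
The feature coordinates given for SEQ ID NO: 2 can be cross-checked against the residue numbering of SEQ ID NO: 3. The Python sketch below is an illustrative consistency check only; the helper name `codons` is hypothetical. It converts each 1-based, inclusive nucleotide range into a codon count.

```python
# Feature locations (1-based, inclusive) as listed for SEQ ID NO: 2.
features = {
    "CDS": (81, 452),
    "sig_peptide": (81, 293),
    "mat_peptide": (294, 452),
}


def codons(location):
    """Number of complete codons in an inclusive nucleotide range."""
    start, end = location
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


lengths = {name: codons(loc) for name, loc in features.items()}
print(lengths)  # {'CDS': 124, 'sig_peptide': 71, 'mat_peptide': 53}
```

The 124 codons of the CDS match the stated length of SEQ ID NO: 3, and the split into 71 and 53 residues matches its numbering from -71 to -1 and from 1 to 53.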

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 450 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 76..441

(ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 76..267

(ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 268..441

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA  60

TAAACGATTA AAAGA ATG AAG GCT GTT TTC TTG GTT TTG TCC TTG ATC GGA  111
                 Met Lys Ala Val Phe Leu Val Leu Ser Leu Ile Gly
                 -64             -60                 -55

TTC TGC TGG GCC CAA CCA TCG AAA TTG AAA CCA GCT AGC GAT ATA CAA  159
Phe Cys Trp Ala Gln Pro Ser Lys Leu Lys Pro Ala Ser Asp Ile Gln
        -50                 -45                 -40

ATT CTT TAC GAC CAT GGT GTG AGG GAG TTC GGG GAA AAC TAT GTT CAA  207
Ile Leu Tyr Asp His Gly Val Arg Glu Phe Gly Glu Asn Tyr Val Gln
    -35                 -30                 -25

GAG TTG ATC GAT AAC ACC ACT TTG GCT AAC GTC GCC ATG GCT GAG AGA  255
Glu Leu Ile Asp Asn Thr Thr Leu Ala Asn Val Ala Met Ala Glu Arg
-20                 -15                 -10                  -5

TTG GAG AAG AGA AGG CCT GAT TTC TGT TTG GAA CCT CCA TAC ACT GGT  303
Leu Glu Lys Arg Arg Pro Asp Phe Cys Leu Glu Pro Pro Tyr Thr Gly
                  1               5                  10

CCA TGT AAA GCT AGA ATC ATC AGA TAC TTC TAC AAC GCC AAG GCT GGT  351
Pro Cys Lys Ala Arg Ile Ile Arg Tyr Phe Tyr Asn Ala Lys Ala Gly
         15                  20                  25

TTG TGT CAA ACT TTC GTT TAC GGT GGC TGC AGA GCT AAG AGA AAC AAC  399
Leu Cys Gln Thr Phe Val Tyr Gly Gly Cys Arg Ala Lys Arg Asn Asn
     30                  35                  40

TTC AAG TCT GCT GAA GAC TGC ATG AGA ACT TGT GGT GGT GCC  441
Phe Lys Ser Ala Glu Asp Cys Met Arg Thr Cys Gly Gly Ala
 45                  50                  55

TAATCTAGA  450

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 122 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Lys Ala Val Phe Leu Val Leu Ser Leu Ile Gly Phe Cys Trp Ala
-64             -60                 -55                 -50

Gln Pro Ser Lys Leu Lys Pro Ala Ser Asp Ile Gln Ile Leu Tyr Asp
            -45                 -40                 -35

His Gly Val Arg Glu Phe Gly Glu Asn Tyr Val Gln Glu Leu Ile Asp
        -30                 -25                 -20

Asn Thr Thr Leu Ala Asn Val Ala Met Ala Glu Arg Leu Glu Lys Arg
    -15                 -10                  -5

Arg Pro Asp Phe Cys Leu Glu Pro Pro Tyr Thr Gly Pro Cys Lys Ala
  1               5                  10                  15

Arg Ile Ile Arg Tyr Phe Tyr Asn Ala Lys Ala Gly Leu Cys Gln Thr
             20                  25                  30

Phe Val Tyr Gly Gly Cys Arg Ala Lys Arg Asn Asn Phe Lys Ser Ala
         35                  40                  45

Glu Asp Cys Met Arg Thr Cys Gly Gly Ala
     50                  55

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 470 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 81..461

(ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 81..287

(ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 288..461

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA  60

TAAACGACGG TACCAAAATA ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC  110
                      Met Lys Leu Lys Thr Val Arg Ser Ala Val
                      -69             -65                 -60

CTT TCG TCA CTC TTT GCA TCT CAG GTC CTT GGC CAA CCA TCG AAA TTG  158
Leu Ser Ser Leu Phe Ala Ser Gln Val Leu Gly Gln Pro Ser Lys Leu
                -55                 -50                 -45

AAA CCA GCT AGC GAT ATA CAA ATT CTT TAC GAC CAT GGT GTG AGG GAG  206
Lys Pro Ala Ser Asp Ile Gln Ile Leu Tyr Asp His Gly Val Arg Glu
            -40                 -35                 -30

TTC GGG GAA AAC TAT GTT CAA GAG TTG ATC GAT AAC ACC ACT TTG GCT  254
Phe Gly Glu Asn Tyr Val Gln Glu Leu Ile Asp Asn Thr Thr Leu Ala
        -25                 -20                 -15

AAC GTC GCC ATG GCT GAG AGA TTG GAG AAG AGA AGG CCT GAT TTC TGT  302
Asn Val Ala Met Ala Glu Arg Leu Glu Lys Arg Arg Pro Asp Phe Cys
    -10                  -5                   1               5

TTG GAA CCT CCA TAC ACT GGT CCA TGT AAA GCT AGA ATC ATC AGA TAC  350
Leu Glu Pro Pro Tyr Thr Gly Pro Cys Lys Ala Arg Ile Ile Arg Tyr
                 10                  15                  20

TTC TAC AAC GCC AAG GCT GGT TTG TGT CAA ACT TTC GTT TAC GGT GGC  398
Phe Tyr Asn Ala Lys Ala Gly Leu Cys Gln Thr Phe Val Tyr Gly Gly
             25                  30                  35

TGC AGA GCT AAG AGA AAC AAC TTC AAG TCT GCT GAA GAC TGC ATG AGA  446
Cys Arg Ala Lys Arg Asn Asn Phe Lys Ser Ala Glu Asp Cys Met Arg
         40                  45                  50

ACT TGT GGT GGT GCC TAATCTAGA  470
Thr Cys Gly Gly Ala
     55



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 127 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Lys Leu Lys Thr Val Arg Ser Ala Val Leu Ser Ser Leu Phe Ala
-69             -65                 -60                 -55

Ser Gln Val Leu Gly Gln Pro Ser Lys Leu Lys Pro Ala Ser Asp Ile
            -50                 -45                 -40

Gln Ile Leu Tyr Asp His Gly Val Arg Glu Phe Gly Glu Asn Tyr Val
        -35                 -30                 -25

Gln Glu Leu Ile Asp Asn Thr Thr Leu Ala Asn Val Ala Met Ala Glu
    -20                 -15                 -10

Arg Leu Glu Lys Arg Arg Pro Asp Phe Cys Leu Glu Pro Pro Tyr Thr
 -5                   1               5                  10

Gly Pro Cys Lys Ala Arg Ile Ile Arg Tyr Phe Tyr Asn Ala Lys Ala
             15                  20                  25

Gly Leu Cys Gln Thr Phe Val Tyr Gly Gly Cys Arg Ala Lys Arg Asn
         30                  35                  40

Asn Phe Lys Ser Ala Glu Asp Cys Met Arg Thr Cys Gly Gly Ala
     45                  50                  55

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 928 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: Homo sapiens

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 8..919

(ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 8..91

(ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 92..919

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTCGACC ATG ATT TAC ACA ATG AAG AAA GTA CAT GCA CTT TGG GCT AGC  49
        Met Ile Tyr Thr Met Lys Lys Val His Ala Leu Trp Ala Ser
        -28         -25                 -20                 -15

GTA TGC CTG CTC CTT AAT CTT GCC CCT GCC CCT CTT AAT GCT GAT TCT  97
Val Cys Leu Leu Leu Asn Leu Ala Pro Ala Pro Leu Asn Ala Asp Ser
                -10                  -5                   1

GAG GAA GAT GAA GAA CAC ACA ATT ATC ACA GAT ACG GAG CTC CCA CCA  145
Glu Glu Asp Glu Glu His Thr Ile Ile Thr Asp Thr Glu Leu Pro Pro
          5                  10                  15

CTG AAA CTT ATG CAT TCA TTT TGT GCA TTC AAG GCG GAT GAT GGG CCC  193
Leu Lys Leu Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro
     20                  25                  30

TGT AAA GCA ATC ATG AAA AGA TTT TTC TTC AAT ATT TTC ACT CGA CAG  241
Cys Lys Ala Ile Met Lys Arg Phe Phe Phe Asn Ile Phe Thr Arg Gln
 35                  40                  45                  50

TGC GAA GAA TTT ATA TAT GGG GGA TGT GAA GGA AAT CAG AAT CGA TTT  289
Cys Glu Glu Phe Ile Tyr Gly Gly Cys Glu Gly Asn Gln Asn Arg Phe
                 55                  60                  65

GAA AGT CTG GAA GAG TGC AAA AAA ATG TGT ACA AGA GAT AAT GCA AAC  337
Glu Ser Leu Glu Glu Cys Lys Lys Met Cys Thr Arg Asp Asn Ala Asn
             70                  75                  80

AGG ATT ATA AAG ACA ACA CTG CAG CAA GAA AAG CCA GAT TTC TGC TTT  385
Arg Ile Ile Lys Thr Thr Leu Gln Gln Glu Lys Pro Asp Phe Cys Phe
         85                  90                  95

TTG GAA GAG GAT CCT GGA ATA TGT CGA GGT TAT ATT ACC AGG TAT TTT  433
Leu Glu Glu Asp Pro Gly Ile Cys Arg Gly Tyr Ile Thr Arg Tyr Phe
    100                 105                 110

TAT AAC AAT CAG ACA AAA CAG TGT GAA AGG TTC AAG TAT GGT GGA TGC  481
Tyr Asn Asn Gln Thr Lys Gln Cys Glu Arg Phe Lys Tyr Gly Gly Cys
115                 120                 125                 130

CTG GGC AAT ATG AAC AAT TTT GAG ACA CTC GAG GAA TGC AAG AAC ATT  529
Leu Gly Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Cys Lys Asn Ile
                135                 140                 145

TGT GAA GAT GGT CCG AAT GGT TTC CAG GTG GAT AAT TAT GGT ACC CAG  577
Cys Glu Asp Gly Pro Asn Gly Phe Gln Val Asp Asn Tyr Gly Thr Gln
            150                 155                 160

CTC AAT GCT GTT AAC AAC TCC CTG ACT CCG CAA TCA ACC AAG GTT CCC  625
Leu Asn Ala Val Asn Asn Ser Leu Thr Pro Gln Ser Thr Lys Val Pro
        165                 170                 175

AGC CTT TTT GAA TTC CAC GGT CCC TCA TGG TGT CTC ACT CCA GCA GAT  673
Ser Leu Phe Glu Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp
    180                 185                 190

AGA GGA TTG TGT CGT GCC AAT GAG AAC AGA TTC TAC TAC AAT TCA GTC  721
Arg Gly Leu Cys Arg Ala Asn Glu Asn Arg Phe Tyr Tyr Asn Ser Val
195                 200                 205                 210

ATT GGG AAA TGC CGC CCA TTT AAG TAC TCC GGA TGT GGG GGA AAT GAA  769
Ile Gly Lys Cys Arg Pro Phe Lys Tyr Ser Gly Cys Gly Gly Asn Glu
                215                 220                 225

AAC AAT TTT ACT AGT AAA CAA GAA TGT CTG AGG GCA TGC AAA AAA GGT  817
Asn Asn Phe Thr Ser Lys Gln Glu Cys Leu Arg Ala Cys Lys Lys Gly
            230                 235                 240

TTC ATC CAA AGA ATA TCA AAA GGA GGC CTA ATT AAA ACC AAA AGA AAA  865
Phe Ile Gln Arg Ile Ser Lys Gly Gly Leu Ile Lys Thr Lys Arg Lys
        245                 250                 255

AGA AAG AAG CAG AGA GTG AAA ATA GCA TAT GAA GAA ATT TTT GTT AAA  913
Arg Lys Lys Gln Arg Val Lys Ile Ala Tyr Glu Glu Ile Phe Val Lys
    260                 265                 270

AAT ATG TGACTCGAC  928
Asn Met
275

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 304 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Ile Tyr Thr Met Lys Lys Val His Ala Leu Trp Ala Ser Val Cys
-28         -25                 -20                 -15

Leu Leu Leu Asn Leu Ala Pro Ala Pro Leu Asn Ala Asp Ser Glu Glu
        -10                  -5                   1

Asp Glu Glu His Thr Ile Ile Thr Asp Thr Glu Leu Pro Pro Leu Lys
  5                  10                  15                  20

Leu Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro Cys Lys
                 25                  30                  35

Ala Ile Met Lys Arg Phe Phe Phe Asn Ile Phe Thr Arg Gln Cys Glu
             40                  45                  50

Glu Phe Ile Tyr Gly Gly Cys Glu Gly Asn Gln Asn Arg Phe Glu Ser
         55                  60                  65

Leu Glu Glu Cys Lys Lys Met Cys Thr Arg Asp Asn Ala Asn Arg Ile
     70                  75                  80

Ile Lys Thr Thr Leu Gln Gln Glu Lys Pro Asp Phe Cys Phe Leu Glu
 85                  90                  95                 100

Glu Asp Pro Gly Ile Cys Arg Gly Tyr Ile Thr Arg Tyr Phe Tyr Asn
                105                 110                 115

Asn Gln Thr Lys Gln Cys Glu Arg Phe Lys Tyr Gly Gly Cys Leu Gly
            120                 125                 130

Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Cys Lys Asn Ile Cys Glu
        135                 140                 145

Asp Gly Pro Asn Gly Phe Gln Val Asp Asn Tyr Gly Thr Gln Leu Asn
    150                 155                 160

Ala Val Asn Asn Ser Leu Thr Pro Gln Ser Thr Lys Val Pro Ser Leu
165                 170                 175                 180

Phe Glu Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp Arg Gly
                185                 190                 195

Leu Cys Arg Ala Asn Glu Asn Arg Phe Tyr Tyr Asn Ser Val Ile Gly
            200                 205                 210

Lys Cys Arg Pro Phe Lys Tyr Ser Gly Cys Gly Gly Asn Glu Asn Asn
        215                 220                 225

Phe Thr Ser Lys Gln Glu Cys Leu Arg Ala Cys Lys Lys Gly Phe Ile
    230                 235                 240

Gln Arg Ile Ser Lys Gly Gly Leu Ile Lys Thr Lys Arg Lys Arg Lys
245                 250                 255                 260

Lys Gln Arg Val Lys Ile Ala Tyr Glu Glu Ile Phe Val Lys Asn Met
                265                 270                 275



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 234 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 76..234

(ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 76..222

(ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 223..234

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

GAATTCATTC AAGAATAGTT CAAACAAGAA GATTACAAAC TATCAATTTC ATACACAATA  60

TAAACGATTA AAAGA ATG AAG GCT GTT TTC TTG GTT TTG TCC TTG ATC GGA  111
                 Met Lys Ala Val Phe Leu Val Leu Ser Leu Ile Gly
                 -49             -45                 -40

TTC TGC TGG GCC CAA CCA GTC ACT GGC GAT GAA TCA TCT GTT GAG ATT  159
Phe Cys Trp Ala Gln Pro Val Thr Gly Asp Glu Ser Ser Val Glu Ile
        -35                 -30                 -25

CCG GAA GAG TCT CTG ATC ATC GCT GAA AAC ACC ACT TTG GCT AAC GTC  207
Pro Glu Glu Ser Leu Ile Ile Ala Glu Asn Thr Thr Leu Ala Asn Val
    -20                 -15                 -10

GCC ATG GCT AAG AGA GAT TCT GAG GAA  234
Ala Met Ala Lys Arg Asp Ser Glu Glu
 -5                   1

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 53 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Lys Ala Val Phe Leu Val Leu Ser Leu Ile Gly Phe Cys Trp Ala
-49             -45                 -40                 -35

Gln Pro Val Thr Gly Asp Glu Ser Ser Val Glu Ile Pro Glu Glu Ser
            -30                 -25                 -20

Leu Ile Ile Ala Glu Asn Thr Thr Leu Ala Asn Val Ala Met Ala Lys
        -15                 -10                  -5

Arg Asp Ser Glu Glu
      1

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 190 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(iii) HYPOTHETICAL: NO

(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(ix) FEATURE:
    (A) NAME/KEY: CDS
    (B) LOCATION: 17..190

(ix) FEATURE:
    (A) NAME/KEY: sig_peptide
    (B) LOCATION: 17..178

(ix) FEATURE:
    (A) NAME/KEY: mat_peptide
    (B) LOCATION: 179..190

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GAATTCAAAC TAAAAA ATG AAG CTT AAA ACT GTA AGA TCT GCG GTC CTT  49
                  Met Lys Leu Lys Thr Val Arg Ser Ala Val Leu
                  -54             -50                 -45

TCG TCA CTC TTT GCA TCG CAG GTC CTA GGT CAA CCA GTC ACT GGC GAT  97
Ser Ser Leu Phe Ala Ser Gln Val Leu Gly Gln Pro Val Thr Gly Asp
            -40                 -35                 -30

GAA TCA TCT GTT GAG ATT CCG GAA GAG TCT CTG ATC ATC GCT GAA AAC  145
Glu Ser Ser Val Glu Ile Pro Glu Glu Ser Leu Ile Ile Ala Glu Asn
        -25                 -20                 -15

ACC ACT TTG GCT AAC GTC GCC ATG GCT AAG AGA GAT TCT GAG GAA  190
Thr Thr Leu Ala Asn Val Ala Met Ala Lys Arg Asp Ser Glu Glu
    -10                  -5                   1



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 58 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Lys Leu Lys Thr Val Arg Ser Ala Val Leu Ser Ser Leu Phe Ala
-54             -50                 -45                 -40

Ser Gln Val Leu Gly Gln Pro Val Thr Gly Asp Glu Ser Ser Val Glu
            -35                 -30                 -25

Ile Pro Glu Glu Ser Leu Ile Ile Ala Glu Asn Thr Thr Leu Ala Asn
        -20                 -15                 -10

Val Ala Met Ala Lys Arg Asp Ser Glu Glu
     -5                   1

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 27 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

27



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CAACCAATAG ACACGCGTAA AGAAGGCCTA CAGCATGATT ACGATACAGA GATCTTGGAG  60

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 62 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CCAAGATCTC TGTATCGTAA TCATGCTGTA GGCCTTCTTT ACGCGTGTCT ATTGGTTGGG  60
CC  62

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 87 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GTACCAAAAT AATGAAACTG AAAACTGTAA GATCTGCGGT CCTTTCGTCA CTCTTTGCAT  60
CTCAGGTCCT TGGCCAACCA ATAGACA  87
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 87 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

CGCGTGTCTA TTGGTTGGCC AAGGACCTGA GATGCAAAGA GTGACGAAAG GACCGCAGAT  60
CTTACAGTTT TCAGTTTCAT TATTTTG  87



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 9 base pairs
    (B) TYPE: nucleic acid

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TAACGTCGC

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 base pairs
    (B) TYPE: nucleic acid

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CATGGCGACG

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 28 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

CTTGGCCAAC CATCGAAATT GAAACCAG  28

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 28 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CTAGCTGGTT TCAATTTCGA TGGTTGGC  28

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 88 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

AATTCAAACT AAAAAATGAA GCTTAAAACT GTAAGATCTG CGGTCCTTTC GTCACTCTTT  60
GCATCGCAGG TCCTAGGTCA ACCAGTCA  88
(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 81 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(vi) ORIGINAL SOURCE:
    (A) ORGANISM: synthetic

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

CTGGTTGACC TAGGACCTGC GATGCAAAGA GTGACGAAAG GACCGCAGAT CTTACAGTTT  60
TAAGCTTCAT TTTTTAGTTT G  81




CLAIMS

1. A DNA construct comprising the following sequence

5'-P-SP-(LP)n-PS-HP-3'

wherein

P is a promoter sequence,

SP is a DNA sequence encoding the yeast aspartic protease 3
(YAP3) signal peptide,

LP is a DNA sequence encoding a leader peptide,
n is 0 or 1,

PS is a DNA sequence encoding a peptide defining a yeast
processing site, and

HP is a DNA sequence encoding a polypeptide which is
heterologous to a selected host organism.

2. A DNA construct according to claim 1, wherein the promoter
sequence is selected from the Saccharomyces cerevisiae MFα1,
TPI, ADH, BAR1 or PGK promoter, or the Schizosaccharomyces
pombe ADH promoter.

3. A DNA construct according to claim 1, wherein the YAP3
signal peptide is encoded by the following DNA sequence

ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA
TCT CAG GTC CTT GGC (SEQ ID No:1)

or a suitable modification thereof encoding a peptide with a
high degree of homology to the YAP3 signal peptide.
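
As an illustration of the sequence recited in claim 3, the following Python sketch translates SEQ ID No:1 with the standard genetic code and compares it with the signal-peptide portion of oligonucleotide MHJ 1131 (SEQ ID No:23), which uses different codons at a few positions. The sketch is not part of the claims; the names `translate`, `SEQ_ID_1` and `OLIGO_VARIANT` are hypothetical, and the comparison shows only that both sequences encode the same 21 residues.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) into one-letter amino acids."""
    dna = dna.replace(" ", "")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# SEQ ID No:1 as recited in claim 3.
SEQ_ID_1 = ("ATG AAA CTG AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
            "TCT CAG GTC CTT GGC")
# Signal-peptide codons as they appear in oligonucleotide MHJ 1131.
OLIGO_VARIANT = ("ATG AAG CTT AAA ACT GTA AGA TCT GCG GTC CTT TCG TCA CTC TTT GCA "
                 "TCG CAG GTC CTA GGT")

peptide = translate(SEQ_ID_1)
# Count codons that differ between the two DNA sequences.
changed_codons = sum(a != b for a, b in zip(SEQ_ID_1.split(), OLIGO_VARIANT.split()))

print(peptide, len(peptide))                # MKLKTVRSAVLSSLFASQVLG 21
print(translate(OLIGO_VARIANT) == peptide)  # True
print(changed_codons)                       # 5
```

All five codon differences are silent, which is one concrete instance of a modified DNA sequence that still encodes the identical signal peptide.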

4. A DNA construct according to claim 1, wherein n is 1.

5. A DNA construct according to claim 4, wherein the leader
peptide is a yeast MFα1 leader peptide or a synthetic leader
peptide.

6. A DNA construct according to claim 1, wherein PS is a DNA
sequence encoding Lys-Arg, Arg-Lys, Lys-Lys, Arg-Arg or Ile-
Glu-Gly-Arg.

7. A DNA construct according to claim 1, wherein the
heterologous polypeptide is selected from the group consisting
of aprotinin, tissue factor pathway inhibitor or other protease
inhibitors, insulin or insulin precursors, human or bovine
growth hormone, interleukin, glucagon, glucagon-like peptide 1,
tissue plasminogen activator, transforming growth factor α or
β, platelet-derived growth factor, enzymes, or a functional
analogue thereof.

8. A DNA construct according to claim 1, which further
comprises a transcription termination sequence.

9. A DNA construct according to claim 8, wherein the
transcription termination sequence is the TPI terminator.

10. A recombinant expression vector comprising a DNA construct
according to any of claims 1-9.

11. A cell transformed with a vector according to claim 10.

12. A cell according to claim 11, which is a fungal cell.

13. A cell according to claim 12, which is a yeast cell.

14. A cell according to claim 13, which is a cell of
Saccharomyces, Schizosaccharomyces, Kluyveromyces, Hansenula or
Yarrowia.

15. A cell according to claim 14, which is a cell of
Saccharomyces cerevisiae or Schizosaccharomyces pombe.

16. A method of producing a heterologous polypeptide, the
method comprising culturing a cell which is capable of
expressing a heterologous polypeptide and which is transformed
with a DNA construct according to any of claims 1-9 in a
suitable medium to obtain expression and secretion of the
heterologous polypeptide, after which the heterologous
polypeptide is recovered from the medium.



[Fig. 1a (sheet 1/13): plasmid construction diagram, not reproduced.
Legible labels: SphI (1), NcoI (561), XbaI (752); mp19RF: SphI-XbaI;
752 bp SphI-XbaI; XbaI; 10 kb XbaI-SphI + 191 bp NcoI-XbaI;
SphI (1), Asp718 (442), NcoI (560), XbaI (751); primer annealed,
extended and ligated: Klenow polymerase + dNTP + T4 DNA ligase + ATP;
pLaCI 965; Asp718; 561 bp SphI-NcoI.]

[Fig. 1b: diagram not reproduced. Legible labels: SphI (1), ApaI (489).]

        10        20        30        40        50        60

         |         |         |         |         |         |
GGAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
         |         |         |         |         |         |
ATAAACGACGGTACCAAAATAATGAAACTGAAAACTGTAAGATCTGCGGTCCTTTCGTCA
                     METLysLeuLysThrValArgSerAlaValLeuSerSer
                     YAP3 SP

       130       140       150       160       170       180
         |         |         |         |         |         |
CTCTTTGCATCTCAGGTCCTTGGCCAACCAATAGACACGCGTAAAGAAGGCCTACAGCAT
LeuPheAlaSerGlnValLeuGlyGlnProIleAspThrArgLysGluGlyLeuGlnHis
************************************

       190       200       210       220       230       240
         |         |         |         |         |         |
GATTACGATACAGAGATCTTGGAGCACATTGGAAGCGATGAGTTAATTTTGAATGAAGAG
AspTyrAspThrGluIleLeuGluHisIleGlySerAspGluLeuIleLeuAsnGluGlu
****************** 63 >15d3 leader **************************

       250       260       270       280       290       300
         |         |         |         |         |         |
TATGTTATTGAAAGAACTTTGCAAGCCATCGATAACACCACTTTGGCTAAGAGATTCGTT
TyrValIleGluArgThrLeuGlnAlaIleAspAsnThrThrLeuAlaLysArgPheVal

       310       320       330       340       350       360
         |         |         |         |         |         |
AACCAACACTTGTGCGGTTCCCACTTGGTTGAAGCTTTGTACTTGGTTTGCGGTGAAAGA
AsnGlnHisLeuCysGlySerHisLeuValGluAlaLeuTyrLeuValCysGlyGluArg
>»»»»»»»»» Insulin precursor MI3 »»»»»»»»»

       370       380       390       400       410       420
         |         |         |         |         |         |
GGTTTCTTCTACACTCCTAAGGCTGCTAAGGGTATTGTCGAACAATGCTGTACCTCCATC
GlyPhePheTyrThrProLysAlaAlaLysGlyIleValGluGlnCysCysThrSerIle

       430       440       450       460       470
         |         |         |         |         |
TGCTCCTTGTACCAATTGGAAAACTACTGCAACTAGACGCAGCCCGCAGGCTCTAGA
CysSerLeuTyrGlnLeuGluAsnTyrCysAsn

Fig. 2



[Fig. 3a: plasmid construction diagram, not reproduced.
Legible labels: SphI (1), ApaI (489), XbaI (860); ApaI (489),
XbaI (770); 419 bp SphI-ApaI, 367 bp ApaI-XbaI + 10 kb XbaI-SphI;
SphI (1), ApaI (489), DdeI (675), XbaI (860); pKFN1003: 204 bp
NcoI-XbaI; 302 bp EcoRI-DdeI; 373 bp SphI-EcoRI + 10 kb XbaI-SphI;
SphI (1), ClaI (502), ClaI (659), NcoI (687), XbaI (890);
516 bp EcoRI-XbaI.]

[Fig. 3b: flow diagram, not reproduced. Legible labels:
SphI (1), ClaI (502);
10 kb ClaI-ClaI dephosphorylated + S.c. MT663 DNA digest with 5' CG tails;
Library of appr. 10^5 pAPR-Sc's;
Appr. 10° yeast transformants screened for trypsin inhibition;
pAPR-Sc1 isolated.]

        10        20        30        40        50        60

         |         |         |         |         |         |
GAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAATA

        70        80        90       100       110       120
         |         |         |         |         |         |
TAAACGATTAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCTGG
               METLysAlaValPheLeuValLeuSerLeuIleGlyPheCysTrp
               spx3

       130       140       150       160       170       180
         |         |         |         |         |         |
GCCCAACCATCGAAATTGAAACCAGCTAGCGATATACAAATTCTTTACGACCATGGTGTG
AlaGlnProSerLysLeuLysProAlaSerAspIleGlnIleLeuTyrAspHisGlyVal

       190       200       210       220       230       240
         |         |         |         |         |         |
AGGGAGTTCGGGGAAAACTATGTTCAAGAGTTGATCGATAACACCACTTTGGCTAACGTC
ArgGluPheGlyGluAsnTyrValGlnGluLeuIleAspAsnThrThrLeuAlaAsnVal

       250       260       270       280       290       300
         |         |         |         |         |         |
GCCATGGCTGAGAGATTGGAGAAGAGAAGGCCTGATTTCTGTTTGGAACCTCCATACACT
AlaMetAlaGluArgLeuGluLysArgArgProAspPheCysLeuGluProProTyrThr

       310       320       330       340       350       360
         |         |         |         |         |         |
GGTCCATGTAAAGCTAGAATCATCAGATACTTCTACAACGCCAAGGCTGGTTTGTGTCAA
GlyProCysLysAlaArgIleIleArgTyrPheTyrAsnAlaLysAlaGlyLeuCysGln

       370       380       390       400       410       420
         |         |         |         |         |         |
ACTTTCGTTTACGGTGGCTGCAGAGCTAAGAGAAACAACTTCAAGTCTGCTGAAGACTGC
ThrPheValTyrGlyGlyCysArgAlaLysArgAsnAsnPheLysSerAlaGluAspCys

       430       440       450
         |         |         |
ATGAGAACTTGTGGTGGTGCCTAATCTAGA
MetArgThrCysGlyGlyAla
»»»>»»»»»»»

Fig. 4

        10        20        30        40        50        60

         |         |         |         |         |         |
GGAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT

        70        80        90       100       110       120
         |         |         |         |         |         |
ATAAACGACGGTACCAAAATAATGAAACTGAAAACTGTAAGATCTGCGGTCCTTTCGTCA
                     METLysLeuLysThrValArgSerAlaValLeuSerSer
                     YAP3sp

       130       140       150       160       170       180
         |         |         |         |         |         |
CTCTTTGCATCTCAGGTCCTTGGCCAACCATCGAAATTGAAACCAGCTAGCGATATACAA
LeuPheAlaSerGlnValLeuGlyGlnProSerLysLeuLysProAlaSerAspIleGln

       190       200       210       220       230       240
         |         |         |         |         |         |
ATTCTTTACGACCATGGTGTGAGGGAGTTCGGGGAAAACTATGTTCAAGAGTTGATCGAT
IleLeuTyrAspHisGlyValArgGluPheGlyGluAsnTyrValGlnGluLeuIleAsp

       250       260       270       280       290       300
         |         |         |         |         |         |
AACACCACTTTGGCTAACGTCGCCATGGCTGAGAGATTGGAGAAGAGAAGGCCTGATTTC
AsnThrThrLeuAlaAsnValAlaMetAlaGluArgLeuGluLysArgArgProAspPhe
»»»»»»

       310       320       330       340       350       360
         |         |         |         |         |         |
TGTTTGGAACCTCCATACACTGGTCCATGTAAAGCTAGAATCATCAGATACTTCTACAAC
CysLeuGluProProTyrThrGlyProCysLysAlaArgIleIleArgTyrPheTyrAsn

       370       380       390       400       410       420
         |         |         |         |         |         |
GCCAAGGCTGGTTTGTGTCAAACTTTCGTTTACGGTGGCTGCAGAGCTAAGAGAAACAAC
AlaLysAlaGlyLeuCysGlnThrPheValTyrGlyGlyCysArgAlaLysArgAsnAsn

       430       440       450       460       470
         |         |         |         |         |
TTCAAGTCTGCTGAAGACTGCATGAGAACTTGTGGTGGTGCCTAATCTAGA
PheLysSerAlaGluAspCysMetArgThrCysGlyGlyAla
»

Fig. 6

SalI

GTCGACC ATG ATT TAC ACA ATG AAG AAA GTA CAT GCA CTT TGG GCT AGC 49 
Met He Tyr Thr Met Lys Lys Val His Ala Leu Trp Ala Ser 
" 28 "25 -20 -15 

§1? EE f TG P G f 71 ^ T F GCC CCT GCC CCT CT T AAT GCT GAT TCT 97 
Val Cys Leu Leu Leu Asn Leu Ala Pro Ala Pro Leu Asn Ala Asp Ser 

-10 -5 l 

Sad 

GAG GAA GAT GAA GAA CAC ACA ATT ATC ACA GAT ACG GAG CTC CCA CCA 145 
Glu Glu Asp Glu Glu His Thr He lie Thr Asp Thr Glu Leu Pro Pro 
5 10 15 



193 



fIS fJJ f 11 2 T 5 £ AT I CA 171 TGT GCA 170 fi CG GAT GAT gS'cCC 
Leu Lys Leu Met His Ser Phe Cys Ala Phe Lys Ala Asp Asp Gly Pro 

25 30 

III f5? S A fT C £ TG . ^ AGA m nc 170 AAT ATT TTC ACT CGA CAG 241 
Cys Lys Ala He Met Lys Arg Phe Phe Phe Asn He Phe Thr Arg Gin 
" 40 45 50 

[illegible] GGA TGT GAA GGA AAT CAG AAT [illegible] TTT 289 
Cys Glu Glu Phe Ile Tyr Gly Gly Cys Glu Gly Asn Gln Asn Arg Phe 

55 60 65 

[illegible] AAA ATG TGT ACA AGA GAT AAT GCA AAC 337 
Glu Ser Leu Glu Glu Cys Lys Lys Met Cys Thr Arg Asp Asn Ala Asn 

70 75 80 

[illegible] CAG CAA GAA AAG CCA GAT TTC TGC TTT 385 
Arg Ile Ile Lys Thr Thr Leu Gln Gln Glu Lys Pro Asp Phe Cys Phe 

85 90 95 

BamHI 

[illegible] ATA TGT CGA GGT TAT ATT ACC AGG TAT TTT 433 
Leu Glu Glu Asp Pro Gly Ile Cys Arg Gly Tyr Ile Thr Arg Tyr Phe 
100 105 110 



AStuI 

TAT [illegible] AGG [illegible] AAG TAT GGT GGA TGC 481 
Tyr Asn Asn Gln Thr Lys Gln Cys Glu Arg Phe Lys Tyr Gly Gly Cys 

115 120 125 130 

[illegible] GAG GAA TGC AAG AAC ATT 529 
Leu Gly Asn Met Asn Asn Phe Glu Thr Leu Glu Glu Cys Lys Asn Ile 

135 140 145 



Fig. 7a 
KpnI 

TGT GAA GAT GGT CCG AAT GGT TTC CAG GTG GAT AAT TAT GGT ACC CAG 577 
Cys Glu Asp Gly Pro Asn Gly Phe Gln Val Asp Asn Tyr Gly Thr Gln 

150 155 160 

HpaI 

CTC AAT GCT GTT AAC AAC TCC CTG ACT CCG CAA TCA ACC AAG GTT CCC 625 

Leu Asn Ala Val Asn Asn Ser Leu Thr Pro Gln Ser Thr Lys Val Pro 
165 170 175 

EcoRI 

AGC CTT TTT GAA TTC CAC GGT CCC TCA TGG TGT CTC ACT CCA GCA GAT 673 
Ser Leu Phe Glu Phe His Gly Pro Ser Trp Cys Leu Thr Pro Ala Asp 
180 185 190 

AEcoRV 

AGA GGA TTG TGT CGT GCC AAT GAG AAC AGA TTC TAC TAC AAT TCA GTC 721 
Arg Gly Leu Cys Arg Ala Asn Glu Asn Arg Phe Tyr Tyr Asn Ser Val 
195 200 205 210 

BspMII 

ATT GGG AAA TGC CGC CCA TTT AAG TAC TCC GGA TGT GGG GGA AAT GAA 769 
Ile Gly Lys Cys Arg Pro Phe Lys Tyr Ser Gly Cys Gly Gly Asn Glu 

215 220 225 

SpeI SphI 
AAC AAT TTT ACT AGT AAA CAA GAA TGT CTG AGG GCA TGC AAA AAA GGT 817 
Asn Asn Phe Thr Ser Lys Gln Glu Cys Leu Arg Ala Cys Lys Lys Gly 

230 235 240 

StuI 

TTC ATC CAA AGA ATA TCA AAA GGA GGC CTA ATT AAA ACC AAA AGA AAA 865 
Phe Ile Gln Arg Ile Ser Lys Gly Gly Leu Ile Lys Thr Lys Arg Lys 
245 250 255 

AGA AAG AAG CAG AGA GTG AAA ATA GCA TAT GAA GAA ATT TTT GTT AAA 913 
Arg Lys Lys Gln Arg Val Lys Ile Ala Tyr Glu Glu Ile Phe Val Lys 
260 265 270 

SalI 

AAT ATG TGAGTCGAC 928 

Asn Met 

275 



Fig. 7b 



WO 95/02059 PCT/DK94/00281 

11/13 



EcoRI 

5361 GAATTCATTCAAGAATAGTTCAAACAAGAAGATTACAAACTATCAATTTCATACACAAT 

5420 ATAAACGATTAAAAGAATGAAGGCTGTTTTCTTGGTTTTGTCCTTGATCGGATTCTGCT 

MetLysAlaValPheLeuValLeuSerLeuIleGlyPheCys 

spx3 signal peptide 



[illegible] BspEI BclI 
5479 GGGCCCAACCAGTCACTGGCGATGAATCATCTGTTGAGATTCCGGAAGAGTCTCTGATC 
TrpAlaGlnProValThrGlyAspGluSerSerValGluIleProGluGluSerLeuIle 
♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦212 leader ♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦ 

5538 ATCGCTGAAAACACCACTTTGGCTAACGTCGCCATGGCTAAGAGAGATTCTGAGGAA 
IleAlaGluAsnThrThrLeuAlaAsnValAlaMetAlaLysArgAspSerGluGlu— 
♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦♦<-TFPI 

Kex2 



Fig. 8a 
EcoRI HindIII [illegible] 

5361 GAATTCAAACTAAAAAATGAAGCTTAAAACTGTAAGATCTGCGGTCCTTTCGTCACTCT 

MetLysLeuLysThrValArgSerAlaValLeuSerSerLeu 

YAP3 signal peptide 

AvaII PflMI 

............... 212 leader ............... 

5479 [illegible] 

5538 AGATTCTGAGGAA — 
ArgAspSerGluGlu — 
♦♦♦<-TFPI 



Fig. 8b 